While exam-style questions are a fundamental educational tool serving a variety of purposes, the manual construction of questions is a complex process that requires training, experience, and resources. To reduce the expenses associated with manual construction and to satisfy the need for a continuous supply of new questions, automatic question generation (QG) techniques can be used. However, compared to automatic question answering (QA), QG is a more challenging task. In this work, we fine-tune a multilingual T5 (mT5) transformer in a multi-task setting for QA, QG, and answer extraction tasks using Turkish QA datasets. To the best of our knowledge, this is the first academic work that attempts to perform automated text-to-text question generation from Turkish texts. Evaluation results show that the proposed multi-task setting achieves state-of-the-art Turkish question answering and question generation performance on the TQuADv1 and TQuADv2 datasets and the XQuAD Turkish split. The source code and pre-trained models are available at https://github.com/obss/turkish-question-generation.
In this paper, we propose a domain adaptation algorithm designed for graph domains. Given a source graph with many labeled nodes and a target graph with few or no labeled nodes, we aim to estimate the target labels by exploiting the similarity between the characteristics of the variation of the label functions on the two graphs. Our assumption about the source and target domains is that the local behavior of the label functions, such as their spread and speed of variation on the graphs, bears similarity between the two graphs. We estimate the unknown target labels by solving an optimization problem in which the label information is transferred from the source graph to the target graph based on the prior that the projections of the label functions onto localized graph bases are similar between the source and target graphs. In order to efficiently capture the local variation of the label functions on the graphs, spectral graph wavelets are used as the graph bases. Experiments on various data sets show that the proposed method yields quite satisfactory classification accuracy compared to reference domain adaptation methods.
For small training set sizes $P$, the generalization error of wide neural networks is well-approximated by the error of an infinite width neural network (NN), either in the kernel or mean-field/feature-learning regime. However, after a critical sample size $P^*$, we empirically find the finite-width network generalization becomes worse than that of the infinite width network. In this work, we empirically study the transition from infinite-width behavior to this variance limited regime as a function of sample size $P$ and network width $N$. We find that finite-size effects can become relevant for very small dataset sizes on the order of $P^* \sim \sqrt{N}$ for polynomial regression with ReLU networks. We discuss the source of these effects using an argument based on the variance of the NN's final neural tangent kernel (NTK). This transition can be pushed to larger $P$ by enhancing feature learning or by ensemble averaging the networks. We find that the learning curve for regression with the final NTK is an accurate approximation of the NN learning curve. Using this, we provide a toy model which also exhibits $P^* \sim \sqrt{N}$ scaling and has $P$-dependent benefits from feature learning.
Multi-task learning (MTL) is a learning paradigm to learn multiple related tasks simultaneously with a single shared network where each task has a distinct personalized header network for fine-tuning. MTL can be integrated into a federated learning (FL) setting if tasks are distributed across clients and clients have a single shared network, leading to personalized federated learning (PFL). To cope with statistical heterogeneity in the federated setting across clients which can significantly degrade the learning performance, we use a distributed dynamic weighting approach. To perform the communication between the remote parameter server (PS) and the clients efficiently over the noisy channel in a power and bandwidth-limited regime, we utilize over-the-air (OTA) aggregation and hierarchical federated learning (HFL). Thus, we propose hierarchical over-the-air (HOTA) PFL with a dynamic weighting strategy which we call HOTA-FedGradNorm. Our algorithm considers the channel conditions during the dynamic weight selection process. We conduct experiments on a wireless communication system dataset (RadComDynamic). The experimental results demonstrate that the training speed with HOTA-FedGradNorm is faster compared to the algorithms with a naive static equal weighting strategy. In addition, HOTA-FedGradNorm provides robustness against the negative channel effects by compensating for the channel conditions during the dynamic weight selection process.
Recently, a surge of high-quality 3D-aware GANs has been proposed, leveraging the generative power of neural rendering. It is natural to associate 3D GANs with GAN inversion methods that project a real image into the generator's latent space, allowing free-view consistent synthesis and editing, referred to as 3D GAN inversion. Although the facial prior is preserved in pre-trained 3D GANs, reconstructing a 3D portrait from only one monocular image is still an ill-posed problem. The straightforward application of 2D GAN inversion methods focuses on texture similarity only while ignoring the correctness of 3D geometry shapes. It may cause geometry collapse effects, especially when reconstructing a side face under an extreme pose. Besides, the synthesized results in novel views are prone to be blurry. In this work, we propose a novel method to promote 3D GAN inversion by introducing a facial symmetry prior. We design a pipeline and constraints to make full use of the pseudo auxiliary view obtained via image flipping, which helps obtain a robust and reasonable geometry shape during the inversion process. To enhance texture fidelity in unobserved viewpoints, pseudo labels from depth-guided 3D warping can provide extra supervision. We design constraints aimed at filtering out conflict areas for optimization in asymmetric situations. Comprehensive quantitative and qualitative evaluations on image reconstruction and editing demonstrate the superiority of our method.
Extraction of the latent sources of complex stimuli is critical for making sense of the world. While the brain solves this blind source separation (BSS) problem continuously, its algorithms remain unknown. Previous work on biologically plausible BSS algorithms assumed that observed signals are linear mixtures of statistically independent or uncorrelated sources, limiting the domain of applicability of these algorithms. To overcome this limitation, we propose novel biologically plausible neural networks for the blind separation of potentially dependent/correlated sources. Differing from previous work, we assume general geometric, rather than statistical, conditions on the source vectors, allowing separation of potentially dependent/correlated sources. Concretely, we assume that the source vectors are sufficiently scattered in their domains, which can be described by certain polytopes. Then, we consider recovery of these sources via the Det-Max criterion, which maximizes the determinant of the output correlation matrix to enforce a similar spread for the source estimates. Starting from this normative principle, and using a weighted similarity matching approach that enables arbitrary linear transformations adaptable by local learning rules, we derive two-layer biologically plausible neural network algorithms that can separate mixtures into sources coming from a variety of source domains. We demonstrate that our algorithms outperform other biologically plausible BSS algorithms on correlated source separation problems.
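The intuition behind the Det-Max criterion can be illustrated numerically: for antisparse sources spread over a hypercube, well-scattered estimates yield a larger determinant of the output correlation matrix than collapsed, correlated estimates. This is a minimal sketch of the criterion only, not the paper's two-layer network; the toy "collapsed" mixture and the `logdet_corr` helper are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# Antisparse sources: 3 channels spread over the hypercube [-1, 1]^3.
spread = rng.uniform(-1, 1, size=(3, 5000))

# "Collapsed" estimates: same scale, but channel 1 is strongly
# correlated with channel 0, so the estimates are not well spread.
collapsed = np.vstack([spread[0],
                       0.7 * spread[0] + 0.3 * spread[1],
                       spread[2]])

def logdet_corr(Y):
    """Log-determinant of the output correlation matrix R = Y Y^T / n."""
    R = (Y @ Y.T) / Y.shape[1]
    return np.linalg.slogdet(R)[1]

# Det-Max prefers the well-spread configuration.
print(logdet_corr(spread) > logdet_corr(collapsed))  # True
```

Maximizing this determinant (subject to the outputs lying in the assumed polytope) therefore pushes the network toward estimates that are as spread out as the true sources.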
Plants are dynamic organisms. Understanding temporal variations in vegetation is an essential problem for all robots in the wild. However, associating repeated 3D scans of plants across time is challenging. A key step in this process is re-identifying and tracking the same individual plant components over time. Previously, this has been achieved by comparing their global spatial or topological locations. In this work, we demonstrate how using shape features improves temporal organ matching. We present a landmark-free shape compression algorithm, which allows for the extraction of 3D shape features of leaves, characterizes leaf shape and curvature efficiently in few parameters, and makes the association of individual leaves in feature space possible. The approach combines 3D contour extraction with further compression using Principal Component Analysis (PCA) to produce a shape-space encoding that is entirely learned from data and retains information about edge contours and 3D curvature. Our evaluation on temporal scan sequences of tomato plants shows that incorporating shape features improves temporal leaf matching. A combination of shape, location, and rotation information proves most informative for recognizing leaves over time, yielding a true positive rate of 75%, a 15% improvement over static baseline methods. This is essential for robotic crop monitoring, enabling comprehensive phenotyping.
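The PCA-based shape-space encoding and feature-space association described above can be sketched as follows. The contour data here is synthetic and the function names are illustrative, not taken from the paper's implementation; real leaf contours would first be resampled to a fixed number of points:

```python
import numpy as np

def pca_shape_codes(contours, n_components=4):
    """Compress fixed-length 3D contours into a low-dimensional shape space.

    contours: (n_leaves, n_points, 3) array of resampled leaf outlines.
    Returns (codes, mean, basis) where codes has shape (n_leaves, n_components).
    """
    X = contours.reshape(len(contours), -1)        # flatten each contour
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = Vt[:n_components]                      # principal shape directions
    codes = (X - mean) @ basis.T                   # shape-space encoding
    return codes, mean, basis

def match_leaves(codes_t0, codes_t1):
    """Nearest-neighbour association of leaves between two scans."""
    d = np.linalg.norm(codes_t0[:, None] - codes_t1[None], axis=-1)
    return d.argmin(axis=1)

# Toy example: 5 leaves with 32 contour points each; the second scan is
# the same set of leaves, slightly perturbed (plant growth / scan noise).
rng = np.random.default_rng(0)
scan0 = rng.normal(size=(5, 32, 3))
scan1 = scan0 + 0.01 * rng.normal(size=scan0.shape)

c0, mean, basis = pca_shape_codes(scan0)
c1 = (scan1.reshape(5, -1) - mean) @ basis.T       # encode in the same space
print(match_leaves(c0, c1))  # each leaf matches its perturbed copy: [0 1 2 3 4]
```

In the paper's full pipeline, these shape codes would be combined with location and rotation cues before matching.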
Early sensory systems in the brain rapidly adapt to fluctuating input statistics, which requires recurrent communication between neurons. Mechanistically, such recurrent communication is often indirect and mediated by local interneurons. In this work, we explore the computational benefits of mediating recurrent communication via interneurons compared with direct recurrent connections. To this end, we consider two mathematically tractable recurrent neural networks that statistically whiten their inputs: one with direct recurrent connections and the other with interneurons that mediate the recurrent communication. By analyzing the corresponding continuous synaptic dynamics and numerically simulating the networks, we show that the network with interneurons is more robust to initialization than the network with direct recurrent connections, in the sense that the convergence time of the synaptic dynamics in the network with interneurons (resp. direct recurrent connections) scales logarithmically (resp. linearly) with the spectrum of their initialization. Our results suggest that interneurons are computationally useful for rapid adaptation to changing input statistics. Interestingly, the network with interneurons is an overparameterized solution of the whitening objective of the network with direct recurrent connections, so our results can be viewed as a recurrent neural network analogue of the implicit acceleration phenomenon observed in overparameterized feedforward linear networks.
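Both circuits contrasted above solve the same statistical whitening objective: find a transform that gives the outputs identity covariance. As a reference for the fixed point they converge to, here is a direct offline ZCA whitening computation, a minimal sketch of the objective rather than of either recurrent circuit:

```python
import numpy as np

rng = np.random.default_rng(2)

# Correlated inputs: isotropic samples passed through a random mixing matrix.
A = rng.normal(size=(3, 3))
X = A @ rng.normal(size=(3, 10000))

# ZCA (symmetric) whitening: M = C^{-1/2} built from the input covariance.
C = np.cov(X)
evals, evecs = np.linalg.eigh(C)
M = evecs @ np.diag(evals ** -0.5) @ evecs.T

# Whitened outputs have identity covariance: cov(M X) = M C M^T = I.
Y = M @ X
print(np.round(np.cov(Y), 2))  # ≈ 3x3 identity matrix
```

The recurrent networks in the paper reach this same solution adaptively through local synaptic dynamics; the result above is about how fast those dynamics converge depending on whether interneurons mediate the recurrence.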
Shadows are essential for realistic image synthesis. Physics-based shadow rendering methods require 3D geometry, which is not always available. Deep learning-based shadow synthesis methods learn a mapping from light information to an object's shadow without explicitly modeling the shadow geometry. Still, they lack control and are prone to visual artifacts. We introduce pixel height, a novel geometry representation that encodes the correlations between objects, the ground, and the camera pose. Pixel height can be calculated from 3D geometry, manually annotated on 2D images, or predicted from a single-view RGB image by a supervised approach. It can be used to compute hard shadows in a 2D image based on projective geometry, providing precise control of the shadows' direction and shape. Furthermore, we propose a data-driven soft shadow generator that applies softness to a hard shadow based on a softness input parameter. Qualitative and quantitative evaluations demonstrate that the proposed pixel height significantly improves the quality of shadow generation while allowing for controllability.
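The idea of casting a hard shadow directly from a pixel-height map can be sketched under a deliberately simplified model: each object pixel drops to its ground contact point and is then displaced in proportion to its height along the projected light direction. The paper uses full projective geometry; `cast_hard_shadow` and its scalar `light_slope` parameter are illustrative assumptions, not the paper's formulation:

```python
import numpy as np

def cast_hard_shadow(mask, height, light_slope):
    """Simplified hard-shadow casting from a pixel-height map.

    mask:        (H, W) bool object silhouette.
    height:      (H, W) int pixel height, i.e. vertical image-space
                 distance from each object pixel to its ground contact.
    light_slope: horizontal shadow displacement per unit of height
                 (stand-in for the projected light direction).
    """
    H, W = mask.shape
    shadow = np.zeros((H, W), dtype=bool)
    for r, c in zip(*np.nonzero(mask)):
        h = height[r, c]
        sr = r + h                            # drop to the ground plane
        sc = c + int(round(light_slope * h))  # slide along the light ray
        if 0 <= sr < H and 0 <= sc < W:
            shadow[sr, sc] = True
    return shadow

# Toy scene: a 3-pixel vertical "stick" standing on ground row 5.
mask = np.zeros((8, 8), dtype=bool)
mask[2:5, 3] = True
height = np.zeros((8, 8), dtype=int)
height[2, 3], height[3, 3], height[4, 3] = 3, 2, 1
shadow = cast_hard_shadow(mask, height, light_slope=1.0)
# The stick's shadow lies flat on ground row 5, at columns 4..6.
```

Changing `light_slope` rotates and stretches the shadow, which is the kind of precise directional control the representation enables before the data-driven softening stage.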
Quantum computers are known to provide speedups over classical state-of-the-art machine learning methods in some specialized settings. For example, quantum kernel methods have been shown to provide an exponential speedup on a learning version of the discrete logarithm problem. Understanding the generalization of quantum models is essential to realizing similar speedups on problems of practical interest. Recent results demonstrate that generalization is hindered by the exponential size of the quantum feature space. Although these results suggest that quantum models cannot generalize when the number of qubits is large, in this paper we show that they rely on overly restrictive assumptions. We consider a wider class of models by varying a hyperparameter that we call the quantum kernel bandwidth. We analyze the large-qubit limit and provide explicit formulas for the generalization of a quantum model that can be solved in closed form. Specifically, we show that changing the value of the bandwidth can take a model from provably being unable to generalize to any target function to generalizing well on well-aligned targets. Our analysis shows how the bandwidth controls the spectrum of the kernel integral operator and hence the inductive bias of the model. We demonstrate empirically that our theory correctly predicts how varying the bandwidth affects the generalization of quantum models on challenging datasets, including those far outside our theoretical assumptions. We discuss the implications of our results for quantum advantage in machine learning.
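The way a bandwidth hyperparameter controls the kernel spectrum, and hence the inductive bias, has a familiar classical analogue. For a Gaussian (RBF) kernel, a tiny bandwidth gives a near-identity kernel matrix (flat spectrum, so the model memorizes and cannot generalize), while a very large bandwidth gives a near-rank-one matrix (the model can only fit nearly constant targets). This classical kernel is a stand-in for the quantum kernel, not the paper's model:

```python
import numpy as np

def rbf_kernel(X, bandwidth):
    """Gaussian kernel k(x, x') = exp(-||x - x'||^2 / (2 * bandwidth^2))."""
    sq = ((X[:, None] - X[None]) ** 2).sum(-1)
    return np.exp(-sq / (2 * bandwidth ** 2))

def top_eig_fraction(X, bandwidth):
    """Fraction of the kernel matrix's trace in its top eigenvalue."""
    evals = np.linalg.eigvalsh(rbf_kernel(X, bandwidth))
    return evals[-1] / evals.sum()

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))

# Tiny bandwidth: K ≈ I, flat spectrum (top eigenvalue carries ≈ 1/200
# of the trace).  Huge bandwidth: K ≈ all-ones, rank-one spectrum.
print(top_eig_fraction(X, 0.01))   # ≈ 0.005
print(top_eig_fraction(X, 100.0))  # ≈ 1.0
```

Intermediate bandwidths interpolate between these extremes, shaping which target functions the kernel's integral operator favors.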